Abstract

We prove algorithmic and hardness results for the problem of finding the largest set of
a fixed diameter in the Euclidean space. In particular, we prove that if $A^*$ is the largest
subset of diameter $r$ of $n$ points in the Euclidean space, then for every $\epsilon > 0$ there exists
a polynomial time algorithm that outputs a set $B$ of size at least $|A^*|$ and of diameter at
most $r(\sqrt{2} + \epsilon)$. On the hardness side, roughly speaking, we show that unless P = NP,
for every $\epsilon > 0$ it is not possible to guarantee the diameter $r(\sqrt{4/3} - \epsilon)$ for $B$ even if the
algorithm is allowed to output a set of size $(\frac{95}{94} - \epsilon)^{-1} |A^*|$.
1 Introduction

The problem that we consider in this paper can be formulated as a clustering problem. These 
types of problems have been studied for quite a long time and have many theoretical and practical
applications in computer science [7]. A branch of clustering problems includes problems in which
given a set of points the goal is to find a "cluster" (or clusters) with minimum size or maximum 
number of points. Typical examples of clusters include spheres, boxes, or any other shape of 
fixed complexity. Of course, the difficulty of the problem greatly depends on the definition 
of cluster. The clusters that we consider here are all the shapes of constant diameter. The 
diameter of a set S is defined as 

$$\mathrm{diam}(S) = \sup_{x, y \in S} |x - y|,$$

where $|x - y|$ is the Euclidean distance between the two points $x$ and $y$. Specifically, we consider
the following problem: 

Problem 1: Let $P$ be a set of $n$ points in $\mathbb{R}^d$ and $r > 0$ be a real number. Find a subset
$S \subseteq P$ of maximum size which satisfies $\mathrm{diam}(S) \le r$.



A clique is a graph in which every two vertices are adjacent. For a graph $G$, let $\omega(G)$ denote
the size of the maximum clique in $G$, i.e. $\omega(G)$ is the maximum number of vertices of $G$
such that every two of them are adjacent. Determining $\omega(G)$ is called the maximum clique
problem. A closely related topic is the notion of an independent set. An independent set in
$G$ is a subset of vertices such that no two of them are adjacent. Let $\alpha(G)$ denote the size of
the maximum independent set in $G$. Determining $\alpha(G)$ is called the maximum independent
set problem. Denote by $G^c$ the complement of $G$, and note that $\omega(G) = \alpha(G^c)$. Thus, the
maximum clique problem for $G$ is equivalent to the maximum independent set problem for $G^c$.
For more on these topics we refer the reader to [TT].

Problem 1 is equivalent to the maximum clique problem in disc graphs, which are defined as
follows. Given a point set $P \subset \mathbb{R}^d$ and a parameter $r$, a disc graph $G$ is defined by $V(G) = P$
and $xy \in E(G)$ if and only if $|x - y| \le r$. There is a one-to-one correspondence between cliques
in $G$ and sets of diameter at most $r$ in $P$.
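
As a small illustration of this correspondence, the following Python sketch builds the disc graph of a point set and finds a maximum clique by exhaustive search. The names `disc_graph` and `largest_set_of_diameter` are hypothetical, and the search is exponential in the number of points, so this is only meant for tiny inputs; it is not an algorithm from this paper.

```python
import itertools
import math


def disc_graph(points, r):
    """Edge set of the disc graph: index pairs (i, j), i < j, at distance <= r."""
    return {
        (i, j)
        for i, j in itertools.combinations(range(len(points)), 2)
        if math.dist(points[i], points[j]) <= r
    }


def largest_set_of_diameter(points, r):
    """Indices of a maximum clique of the disc graph (brute force).

    A set of indices is a clique exactly when the corresponding points
    have diameter at most r, so this solves Problem 1 on tiny inputs.
    """
    edges = disc_graph(points, r)
    for size in range(len(points), 0, -1):          # try the largest sizes first
        for subset in itertools.combinations(range(len(points)), size):
            if all(pair in edges for pair in itertools.combinations(subset, 2)):
                return list(subset)
    return []
```

On the four corners of a unit square plus one far point, the answer grows from a single side of the square (for r = 1) to the whole square once r exceeds the diagonal.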

In Problem 1 the diameter is fixed and our objective is to maximize the number of points. 
On the other hand, we can fix the number of points and ask for the minimum diameter: 

Problem 2: Let $P$ be a set of $n$ points in $\mathbb{R}^d$ and $k > 0$ be an integer. Find a subset
$S \subseteq P$ of size $k$ with minimum diameter.

We show that both these problems are NP-hard when the dimension is sufficiently large, i.e.,
for some $d = O(\log n)$. In fact, we prove a stronger result which shows that even certain
approximations of these problems are impossible unless P=NP. These approximations are defined
in the following way:

Definition. For $t, s \ge 1$, a $(t, s)$-approximation algorithm for Problem 1 is an algorithm that
returns a set $S$ of size at least $\mathrm{Opt}/t$ so that $\mathrm{diam}(S) \le sr$, where Opt is the size of the optimal
answer to Problem 1.

Definition. For $t, s \ge 1$, a $(t, s)$-approximation algorithm for Problem 2 is an algorithm that
returns a set $S$ of size at least $k/t$ so that $\mathrm{diam}(S) \le s \times \mathrm{Opt}$, where Opt is the optimal answer
to Problem 2.

These two problems are obtained by relaxing the size and diameter constraints of the output 
set. A simple observation shows that these two problems are equivalent. 

Lemma 1 For $t, s \ge 1$, there exists a polynomial time $(t, s)$-approximation algorithm for
Problem 1 if and only if there exists a polynomial time $(t, s)$-approximation algorithm for
Problem 2.

Proof. Let $\mathcal{A}$ be a $(t, s)$-approximation algorithm for Problem 1. Consider an instance $(P, k)$
of Problem 2. For every pair of points $x, y \in P$, run $\mathcal{A}$ with parameter $r_{x,y} := |x - y|$. We
output the set returned for the minimum $r_{x,y}$ for which the answer of $\mathcal{A}$ is of size at least $k/t$.
Let $S_0$ be the optimal solution to the $(P, k)$ instance of Problem 2. At some point $\mathcal{A}$ is called
with parameter $r := \mathrm{diam}(S_0)$. Now the output of $\mathcal{A}$ is a set of size at least $|S_0|/t \ge k/t$ and
with diameter at most $rs = s \times \mathrm{diam}(S_0)$.

To prove the other direction let $\mathcal{B}$ be a $(t, s)$-approximation algorithm for Problem 2.
Consider an instance $(P, r)$ for Problem 1. A $(t, s)$-approximation algorithm for Problem 1 can be
obtained in a similar way by running $\mathcal{B}$ for every $k = 1, \ldots, n$. ■
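
The first direction of the proof can be sketched in Python as follows. This is only an illustration under stated assumptions: `approx_problem2` and `exact_problem1` are hypothetical names, and the oracle passed in is a brute-force stand-in (a (1, 1)-approximation) rather than any particular algorithm from the literature.

```python
import itertools
import math


def diameter(points):
    """Largest pairwise Euclidean distance (0 for fewer than two points)."""
    return max(
        (math.dist(p, q) for p, q in itertools.combinations(points, 2)),
        default=0.0,
    )


def exact_problem1(points, r):
    """Stand-in oracle: brute-force optimum of Problem 1 (exponential time)."""
    for size in range(len(points), 0, -1):
        for subset in itertools.combinations(points, size):
            if diameter(subset) <= r:
                return list(subset)
    return []


def approx_problem2(points, k, t, approx_problem1):
    """Use a (t, s)-approximation for Problem 1 to approximate Problem 2."""
    # Candidate radii: all pairwise distances, plus 0 to cover k = 1.
    radii = sorted(
        {0.0} | {math.dist(p, q) for p, q in itertools.combinations(points, 2)}
    )
    for r in radii:                       # smallest candidate radius first
        subset = approx_problem1(points, r)
        if len(subset) >= k / t:          # first radius that yields enough points
            return subset
    return None
```

Only the O(n^2) pairwise distances are tried because the optimal diameter for Problem 2 is always one of them.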

Since these two problems are equivalent we refer to both of them as the Diameter
Approximation Problem.

Both Problems 1 and 2 are solvable in the 2-dimensional plane in polynomial time [2], [T], [8].
For Problem 2 the fastest known algorithm achieves the running time $O(n \log n + k^{2.5} n \log k)$ [8].
It is shown in pQ that in the 3-dimensional space there is a $(\pi / \cos^{-1}(1/3), 1)$-approximation
algorithm. Finally, when the dimension $d$ is a fixed constant, one can design a polynomial time
approximation scheme achieving a $(1, 1 + \epsilon)$-approximation, for every $\epsilon > 0$ [Tj. It is also easy
to see that there exists a trivial $(1, 2)$-approximation algorithm for this problem in any dimension:
a ball with radius $r$ about a point $x \in P$ containing the maximum number of points is a
$(1, 2)$-approximation for Problem 1. Thus, it is interesting to study at which point the problem
turns from polynomially solvable to NP-hard. We have the following result in this direction:
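
The trivial (1, 2)-approximation mentioned above can be written down directly. The sketch below uses hypothetical names (`trivial_approx`, `diameter`); it relies only on the observation that a ball of radius r about any point of the optimal set contains the whole optimal set, while any two points in such a ball are at distance at most 2r.

```python
import itertools
import math


def diameter(points):
    """Largest pairwise Euclidean distance (0 for fewer than two points)."""
    return max(
        (math.dist(p, q) for p, q in itertools.combinations(points, 2)),
        default=0.0,
    )


def trivial_approx(points, r):
    """Return the points of the fullest ball of radius r centered at an input point."""
    best = []
    for center in points:
        inside = [p for p in points if math.dist(center, p) <= r]
        if len(inside) > len(best):
            best = inside
    return best
```

For the points (0,0), (1,0), (0,1), (5,5) and r = 1, the largest set of diameter at most 1 has two points, whereas the ball about the origin returns three points of diameter sqrt(2), which is within the allowed 2r.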

Theorem 2 For every $\epsilon > 0$ there exists $d = O(\log n)$, so that there is no polynomial time
$(\frac{95}{94} - \epsilon, \sqrt{4/3} - \epsilon)$-approximation algorithm for the Diameter Approximation Problem in dimension
$d$ unless P=NP.

We also improve upon the trivial (1, 2)-approximation algorithm and obtain the following 
theorem: 

Theorem 3 For every $\epsilon > 0$ there is a polynomial time $(1, \sqrt{2} + \epsilon)$-approximation algorithm
for the Diameter Approximation Problem in any dimension.

In Section 2.1 we prove Theorem 2. We use spectral properties to move from combinatorics
of graphs to geometry of Euclidean space. This technique in combination with a hardness result
regarding the maximum independent set problem in 3-regular graphs proves Theorem 2. In
Section 2.2 we prove Theorem 3 using simple geometric techniques. Finally, in the "Corollaries"
subsection we observe that having Theorem 3 in hand, it is possible to move in the other direction.
Corollaries 5 and 6 show that one can apply Theorem 3 to the geometric representation
of the graph to approximate the maximum independent set problem for certain graphs.

2 From Graphs to Euclidean Space 

In this section we prove Theorem 2. We show that unless P=NP, there is no
$(\frac{95}{94} - \epsilon, \sqrt{4/3} - \epsilon)$-approximation algorithm for the Diameter Approximation Problem for every $\epsilon > 0$. We use
spectral techniques to show that a certain metric on the graph embeds isometrically into the
Euclidean space. This type of reduction from geometry to graph theory via metric embedding
has been successfully applied to various problems dealing with the so-called (1-2)-metrics [TUl
[To]. Our proof of Theorem 2 relies on the following result:

Theorem 4 [3] For every $\epsilon > 0$, unless P=NP, there is no polynomial time algorithm that
approximates the maximum independent set problem in 3-regular graphs within a factor of $\frac{95}{94} - \epsilon$.
2.1 Proof of Theorem 2



To prove the theorem we reduce the maximum independent set problem in 3-regular graphs
to the Diameter Approximation Problem.

Consider a 3-regular graph $G$. Denote by $G^c$ the complement of $G$, and notice that cliques in
$G^c$ correspond to independent sets in $G$. We begin by finding a lower bound for the minimum
eigenvalue of the adjacency matrix $A_{G^c}$ of $G^c$. Denote by $\lambda_1 \le \ldots \le \lambda_n$ the eigenvalues of $A_{G^c}$.
Since $G^c$ is an $(n - 4)$-regular graph, its Laplacian can be written as $L_{G^c} = (n - 4)I - A_{G^c}$.
Thus for a vector $v$, we have $A_{G^c} v = \lambda v$ if and only if $L_{G^c} v = (n - 4 - \lambda) v$. This shows that the
eigenvalues of $L_{G^c}$ are $n - 4 - \lambda_n \le \ldots \le n - 4 - \lambda_1$. It is well-known [9] that the maximum
eigenvalue of the Laplacian is bounded by $n$. So $n - 4 - \lambda_1 \le n$, that is, the minimum eigenvalue
of $A_{G^c}$ is at least $-4$.
Define $Q := A_{G^c} + (4 + \gamma)I$ for $\gamma > 0$ chosen sufficiently small. By the same argument that we
used above to find the eigenvalues of $L_{G^c}$, one can show that the smallest eigenvalue of $Q$ is
$\lambda_1 + 4 + \gamma > 0$, which means that $Q$ is a symmetric positive definite matrix. This implies that
there exists a nonsingular matrix $U$, which can be found using elementary techniques, such that
$Q = U^T U$ (see [11] page 285 for instance). Define a function $f : [n] \to \mathbb{R}^n$ by setting the value
of $f(i)$ to be the $i$-th column of the matrix $U$, so that $f(i) \cdot f(j) = Q_{ij}$. Note that

$$|f(i)|^2 = f(i) \cdot f(i) = Q_{ii} = 4 + \gamma,$$

and

$$|f(i) - f(j)|^2 = |f(i)|^2 + |f(j)|^2 - 2 f(i) \cdot f(j) = 8 + 2\gamma - 2Q_{ij}.$$

Thus, for $i \ne j$,

$$|f(i) - f(j)|^2 = \begin{cases} 6 + 2\gamma & \text{if } v_i v_j \in E(G^c), \\ 8 + 2\gamma & \text{otherwise.} \end{cases}$$

Consider the vertex $v_i$ of $V(G^c)$ which corresponds to the $i$-th row and column of $A_{G^c}$. Map $v_i$
to the point $f(i)$ in Euclidean space $\mathbb{R}^n$. Let $P$ be the resulting point set in $\mathbb{R}^n$. The above
properties imply that every vertex of $V(G^c)$ is mapped to a vector of magnitude $\sqrt{4 + \gamma}$ (close to 2) and the
distance between two vertices is $\sqrt{6 + 2\gamma}$ if there is an edge between them, and $\sqrt{8 + 2\gamma}$ if not.
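
The embedding can be checked numerically on a small example. The sketch below is an illustration only: it picks the Cholesky factorization $Q = LL^T$ as one concrete choice of factorization (so $U = L^T$, and the $i$-th row of $L$ plays the role of $f(i)$), and the function names `cholesky` and `embed_complement` are hypothetical.

```python
import math


def cholesky(Q):
    """Lower-triangular L with L L^T = Q, for a symmetric positive definite Q."""
    n = len(Q)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(Q[i][i] - s)
            else:
                L[i][j] = (Q[i][j] - s) / L[j][j]
    return L


def embed_complement(adj_c, gamma):
    """Map vertex i of G^c to a point f(i) with f(i) . f(j) = Q_ij.

    adj_c is the 0/1 adjacency matrix of G^c and Q = adj_c + (4 + gamma) I.
    """
    n = len(adj_c)
    Q = [
        [adj_c[i][j] + (4.0 + gamma if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return cholesky(Q)  # row i of the factor is the point f(i)
```

For instance, if $G = K_{3,3}$ then $G^c$ consists of two disjoint triangles, and the squared distances between the embedded points come out as $6 + 2\gamma$ inside a triangle and $8 + 2\gamma$ across the two triangles.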

Using the Johnson-Lindenstrauss dimension reduction lemma [121 EE] there exists a dimension
$d = O(\Delta^{-2} \log n)$, and a polynomial algorithm which maps $P$ into $\mathbb{R}^d$ such that the distance
between any two points of $P$ changes by a factor of at most $1 + \Delta/2$. Let $g : V(G^c) \to \mathbb{R}^d$
be the corresponding map. Thus if we choose $\Delta$ and $\gamma$ small enough, for every two vertices
$v_i, v_j \in V(G^c)$ the distance between $g(i)$ and $g(j)$ is at most $(1 + \Delta)\sqrt{6}$ if they are connected
in $G^c$ and at least $(1 + \Delta)^{-1}\sqrt{8}$ if they are not connected in $G^c$. So for a set $S \subseteq V(G^c)$,
its geometric representation will have diameter at most $(1 + \Delta)\sqrt{6}$ if it is a clique but it will
have diameter at least $(1 + \Delta)^{-1}\sqrt{8}$ otherwise. By picking $\epsilon$ to be small enough and applying a
$(\frac{95}{94} - \epsilon, \sqrt{4/3} - \epsilon)$-approximation algorithm to Problem 1 with $g(G^c)$ and $r = \sqrt{6}$ we can find a
clique of size at least $1/(\frac{95}{94} - \epsilon)$ times the size of the maximum clique in $G^c$. Theorem 4 shows
that this is impossible unless P = NP.
2.2 Proof of Theorem 3

In this section we prove Theorem 3 by applying simple geometric techniques. We follow the
general ideas and techniques of [TJ [2], borrowing and generalizing the main tool from [2]. The
idea is to extend and generalize the trivial $(1, 2)$-approximation. One way to interpret the
$(1, 2)$-approximation is to say that any set $A$ of diameter $r$ can be placed inside a ball of diameter $2r$
centered at a point in $A$. To obtain a $(1, \sqrt{2} + \epsilon)$-approximation, we first show that any set of
diameter $r$ can actually be placed inside a ball of diameter $(\sqrt{2} + \epsilon)r$, and then we produce a
polynomial time algorithm to compute such a ball.

Let $A$ be the optimal answer to Problem 1. We start by proving that $A$ is inside a ball of
diameter $(\sqrt{2} + \epsilon)r$. Let $B(P, t)$ denote a ball of radius $t$ centered at point $P$. At the beginning,
$P_1$ is an arbitrary point of $A$ and thus we have $A \subseteq B(P_1, r)$, so we set $V_1 = P_1$ and $r_1 = r$.
At the $i$-th step we assume $A \subseteq B(V_i, r_i)$ for a $V_i \in \mathbb{R}^d$ and some value $r_i \le r$ to be determined later.

Let $P_{i+1}$ be the point of $A$ with maximum distance to $V_i$. This implies

$$A \subseteq B(V_i, r_i) \cap B(P_{i+1}, r) \cap B(V_i, |V_i - P_{i+1}|),$$

and since $|V_i - P_{i+1}| \le r_i$, we have:

$$A \subseteq B(P_{i+1}, r) \cap B(V_i, |V_i - P_{i+1}|).$$

If $x = |V_i - P_{i+1}| \le r\sqrt{2}/2$ then the set of points is inside a ball of diameter $\sqrt{2}r$. So we
assume $x > r\sqrt{2}/2$.

Consider a point $L$ on the intersection of boundaries of the two balls $B(P_{i+1}, r)$ and $B(V_i, x)$
(Figure 1). Consider the plane passing through $L$, $P_{i+1}$ and $V_i$ and draw the line $LV_{i+1}$
perpendicular to the segment $P_{i+1}V_i$. A simple calculation proves that:

$$|L - V_{i+1}| = r\sqrt{1 - \frac{r^2}{4x^2}} \le r\sqrt{1 - \frac{r^2}{4r_i^2}}.$$

Define $r_{i+1} = r\sqrt{1 - r^2/(4r_i^2)}$. It can be also verified that if $x > r\sqrt{2}/2$ then
$|P_{i+1} - V_{i+1}| < |V_{i+1} - L|$ and the ball $B(V_{i+1}, |L - V_{i+1}|)$ will contain the intersection
$B(P_{i+1}, r) \cap B(V_i, x)$. This implies $A \subseteq B(V_{i+1}, |L - V_{i+1}|) \subseteq B(V_{i+1}, r_{i+1})$. It is easy to
check that the sequence $r_1, r_2, \ldots$ converges to $r\sqrt{2}/2$. Thus given any $\epsilon > 0$ it is possible to
fix a constant $k$ (depending only on $\epsilon$) such that $r_k \le r\sqrt{2}/2 + \epsilon$.
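
The convergence claim is easy to check numerically. The sketch below assumes the recurrence reads $r_1 = r$ and $r_{i+1} = r\sqrt{1 - r^2/(4r_i^2)}$; the function name `radius_sequence` is hypothetical. Under this assumption a short induction gives the closed form $r_i = r\sqrt{(i+1)/(2i)}$, which the code can be compared against.

```python
import math


def radius_sequence(r, eps):
    """Radii r_1, r_2, ..., r_k, stopping at the first r_k <= r*sqrt(2)/2 + eps."""
    limit = r * math.sqrt(2) / 2
    radii = [r]                                   # r_1 = r
    while radii[-1] > limit + eps:
        prev = radii[-1]
        radii.append(r * math.sqrt(1 - r * r / (4 * prev * prev)))
    return radii
```

With r = 1 and eps = 0.1 the loop stops at k = 4, since the radii are roughly 1, 0.866, 0.816, 0.791 and the target is about 0.807. The gap to the limit shrinks only like 1/i, so k grows roughly in proportion to r/eps.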

To obtain an algorithm from the discussion above, we only need to consider all different
possible choices for $P_1, \ldots, P_k$. Discarding the invalid choices or choices that result in an
invalid state, each choice for $P_1, \ldots, P_k$ leads to a ball with radius at most $r\sqrt{2}/2 + \epsilon$. Now
the algorithm outputs the one which contains the maximum number of points. Since $k$ is a
constant and there are at most $n^k$ choices, the algorithm is polynomial.
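
A minimal Python sketch of this enumeration is given below. It is one possible reading of the procedure, not a definitive implementation: the names `ball_candidates` and `largest_set_sqrt2` are hypothetical, a choice is treated as invalid when the next point lies outside the current ball, and the center $V_{i+1}$ is placed on the segment $P_{i+1}V_i$ at distance $r^2/(2x)$ from $P_{i+1}$, which follows from the right triangle $P_{i+1}V_{i+1}L$ with $|P_{i+1} - L| = r$.

```python
import itertools
import math


def ball_candidates(points, r, k):
    """Yield one (center, radius) ball for every valid sequence P_1, ..., P_k."""
    half = r * math.sqrt(2) / 2
    for seq in itertools.product(points, repeat=k):
        center, radius = seq[0], r                 # V_1 = P_1, r_1 = r
        valid = True
        for nxt in seq[1:]:
            x = math.dist(center, nxt)
            if x > radius + 1e-12:                 # P_{i+1} outside B(V_i, r_i)
                valid = False
                break
            if x <= half:                          # B(V_i, r*sqrt(2)/2) suffices
                radius = half
                break
            # V_{i+1} lies on the segment P_{i+1} V_i, r^2 / (2x) away from P_{i+1}.
            step = r * r / (2 * x * x)
            center = tuple(p + step * (v - p) for p, v in zip(nxt, center))
            radius = r * math.sqrt(1 - r * r / (4 * radius * radius))
        if valid:
            yield center, radius


def largest_set_sqrt2(points, r, k):
    """Points inside the fullest candidate ball; k controls the radius bound r_k."""
    best = []
    for center, radius in ball_candidates(points, r, k):
        inside = [p for p in points if math.dist(p, center) <= radius + 1e-12]
        if len(inside) > len(best):
            best = inside
    return best
```

The sketch enumerates all $n^k$ sequences, so it is polynomial only for constant $k$, and with $k = 1$ it reduces to the trivial $(1, 2)$-approximation.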

2.3 Corollaries 

The following corollaries can be obtained using techniques employed in the previous section. 
Figure 1: The position of $V_{i+1}$ with respect to the positions of $V_i$, $P_{i+1}$, and $L$.

Corollary 5 Fix $\delta > 0$ and let $G$ be a graph such that there exists a mapping $f : V(G) \to \mathbb{R}^n$
satisfying $|f(u) - f(v)| \ge \sqrt{2} + \delta$ if $uv \in E(G)$, and $|f(u) - f(v)| \le 1$ otherwise. Then it is
possible to find the size of the maximum independent set of $G$ in polynomial time.

Proof. Let $V(G) = \{v_1, \ldots, v_n\}$. Suppose that the $i$-th row of a matrix $M$ is the vector $f(v_i)$.
Let further $A = MM^T$. Clearly, $A$ is positive semi-definite, and $|f(v_i) - f(v_j)|^2 = A_{ii} + A_{jj} - 2A_{ij}$.
Thus finding the map $f$ reduces to finding a positive semi-definite matrix $A$ with

$$A_{ii} + A_{jj} - 2A_{ij} \ge (\sqrt{2} + \delta)^2 \quad \forall v_i v_j \in E(G),$$

and

$$A_{ii} + A_{jj} - 2A_{ij} \le 1 \quad \forall i \ne j, \; v_i v_j \notin E(G).$$

As in ([H], Theorem 3.2) the ellipsoid algorithm can be invoked to find such a matrix $A$.

Then we apply the algorithm of Theorem 3 to $f(V(G))$ with the setting $r = 1$ and $\epsilon = \delta/2$.
Let $I$ be an independent set of maximum size in $G$. Then the diameter of $f(I)$ is at most 1
because $|f(u) - f(v)| \le 1$ if $uv \notin E(G)$. The algorithm given in the proof of Theorem 3 finds
a set $P$ of diameter at most $\sqrt{2} + \delta/2$ whose size is at least $|I|$. Since $|f(u) - f(v)| \ge \sqrt{2} + \delta$ when
$uv \in E(G)$, we conclude that $f^{-1}(P)$ is an independent set. This completes the proof. ■

Corollary 6 Fix $\epsilon > 0$ and let $G$ be a graph whose minimum eigenvalue is at least $-2 + \epsilon$. It
is possible to find the size of the maximum independent set of $G$ in polynomial time.

Proof. By the proof of Theorem 2 every such graph satisfies the condition of Corollary 5. ■

3 Concluding Remarks 

• We believe that Theorem 3 is "almost" sharp in the sense that the constant $\sqrt{4/3} - \epsilon$
in Theorem 2 can be improved to $\sqrt{2} - \epsilon$, becoming arbitrarily close to the $\sqrt{2} + \epsilon$ upper
bound of Theorem 3.

• The main idea behind the proof of Theorem 3 was to introduce a polynomial time
algorithm that given $n$ points computes a ball of diameter $(\sqrt{2} + \epsilon)r$ which contains the
largest subset of the points that has diameter at most $r$. The fact that such a ball exists
was already known, and in fact stronger results have been obtained using Helly-type
arguments (we refer the reader to [SI [15] for the proofs and the description of the Helly-type
theorems). However, the novel part of the proof was the algorithmic aspect, and showing
that there exists a polynomial time algorithm which finds such a ball.

• There is already a body of work dedicated to the characterization of all graphs with
smallest eigenvalue at least $-2$ (see [3J @j). These graphs have been characterized as
"generalized line graphs" plus some finite set of exceptions. This characterization gives
an alternative proof for Corollary 6 which uses a different polynomial time algorithm.